Workshop Automated Content Analysis

Session 5: Deep Learning

Author

Johannes B. Gruber

Published

July 7, 2023

Introduction

In the last session, we worked with some pre-trained word embedding models and found out that they still come with a lot of problems. The next breakthrough in the development of embeddings were transformer models (Vaswani et al. 2017). They have several advantages which lead to models that

  • Can take larger contexts into account when training (remember the limited window we used before)
  • Can be trained much more efficiently and can hence take in even more texts
  • Can have several embeddings for each word depending on the context, finally moving away from the bag-of-words paradigm
  • Can be fine-tuned on new data which contains different vocabulary
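The last point about context-dependent embeddings is worth a tiny illustration. The sketch below is a toy, not a real transformer: it uses made-up 2-dimensional "static" vectors and a single attention-style weighted average to show how the same word can end up with a different vector depending on its context.

```python
# Toy illustration (NOT a real transformer): the same word gets a different
# vector depending on context, via one attention-style weighted average.
import math

# hypothetical static 2-d embeddings for a few words
static = {
    "bank":  [1.0, 1.0],
    "river": [0.0, 2.0],
    "money": [2.0, 0.0],
}

def contextual(word, sentence):
    """Attention-weighted mix of the static vectors of the words in the sentence."""
    q = static[word]
    scores = [sum(a * b for a, b in zip(q, static[w])) for w in sentence]
    weights = [math.exp(s) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * static[t][i] for w, t in zip(weights, sentence))
            for i in range(2)]

v1 = contextual("bank", ["river", "bank"])  # [0.5, 1.5]
v2 = contextual("bank", ["money", "bank"])  # [1.5, 0.5]
print(v1, v2)  # the two "bank" vectors differ
```

In a bag-of-words or static embedding model, "bank" would get the same representation in both sentences; here the surrounding words pull the vector in different directions.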

However, compared to established approaches like naive Bayes or SVMs, this technology is still relatively young. The wave of R implementations that arrived some 10-15 years ago for those earlier methods has not really happened yet for transformers. Moreover, the models effectively require modern, powerful hardware: the underlying matrix computations are slow on CPUs and really need a GPU.
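To get a feeling for why a GPU matters, here is a back-of-the-envelope count of multiply-accumulate operations for a single dense layer, using sizes typical of BERT-style models (the numbers are illustrative assumptions, not measurements):

```python
# rough cost of one [seq_len x d_model] @ [d_model x d_model] matrix multiplication
seq_len, d_model = 512, 768         # tokens per text, embedding width (BERT-like, assumed)
macs = seq_len * d_model * d_model  # multiply-accumulate operations
print(f"{macs:,} MACs for a single layer and a single text")  # ~302 million
```

Multiply that by dozens of layers, thousands of texts, and many training steps, and it becomes clear why hardware that parallelises matrix maths makes the difference between minutes and days.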

So this session is currently more of a preview than an actual hands-on tutorial.

R wrappers for large language models

Another problem with LLMs is that they are predominantly controlled from Python. R has excellent wrappers for languages like C, C++, Rust or Java, but its bridge to Python still falls a little behind in terms of ease of use. Packages like spacyr and grafzahl employ Python anyway through the reticulate compatibility layer (though they still have some issues to figure out).

Let’s see how we can use grafzahl for classifying the imdb data again, which we used in the supervised machine learning session. As a first step, we have to set up the package:

install.packages("grafzahl")
grafzahl::setup_grafzahl(cuda = grafzahl::detect_cuda())

This installs a small version of Python on your system and sets up R to use it. The cuda argument essentially controls whether grafzahl can access your NVIDIA graphics card. If you do not have one of these cards, you can still use the underlying models, but they will take a very long time to do anything.

The steps for supervised machine learning (SML) are still the same:

  1. preprocessing the incoming text

We do not really have to do anything here as transformer models already take care of most steps and the dataset we are using is already quite clean.

  2. splitting the dataset into training and a test set (which is not included in the model and just used for validation)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidymodels)
── Attaching packages ────────────────────────────────────── tidymodels 1.1.0 ──
✔ broom        1.0.4     ✔ rsample      1.1.1
✔ dials        1.2.0     ✔ tune         1.1.1
✔ infer        1.0.4     ✔ workflows    1.1.3
✔ modeldata    1.1.0     ✔ workflowsets 1.0.1
✔ parsnip      1.1.0     ✔ yardstick    1.2.0
✔ recipes      1.0.6     
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ scales::discard() masks purrr::discard()
✖ dplyr::filter()   masks stats::filter()
✖ recipes::fixed()  masks stringr::fixed()
✖ dplyr::lag()      masks stats::lag()
✖ yardstick::spec() masks readr::spec()
✖ recipes::step()   masks stats::step()
• Search for functions across packages at https://www.tidymodels.org/find/
imdb <- readRDS("data/imdb.rds")
set.seed(1)
split <- initial_split(
  data = imdb, 
  prop = 3 / 4,   # the prop is the default, I just wanted to make that visible
  strata = label  # this makes sure the prevalence of labels is still the same afterwards
) 
imdb_train <- training(split)
imdb_test <- testing(split)
  3. fitting (or training) the model
if (!file.exists("data/5_imdb_distilbert.rds")) {
  library(grafzahl)
  model <- grafzahl(x = imdb_train$text,
                    y = imdb_train$label,
                    model_name = "distilbert-base-uncased",
                    output_dir = "model",
                    cuda = TRUE,
                    num_train_epochs = 1,
                    train_size = 1)
  saveRDS(model, "data/5_imdb_distilbert.rds")
} else {
  model <- readRDS("data/5_imdb_distilbert.rds")
}

If you do not have a graphics card, this step will take a long time.

  4. using the test set to compare predictions against the real values for validation
library(gt)
if (!file.exists("data/5_imdb_distilbert_prediction.rds")) {
  library(grafzahl)
  estimates <- predict(model, newdata = imdb_test$text)
  saveRDS(estimates, "data/5_imdb_distilbert_prediction.rds")
} else {
  estimates <- readRDS("data/5_imdb_distilbert_prediction.rds")
}
imdb_prediction <- imdb_test |>  
  bind_cols(estimate = estimates) |> 
  mutate(truth = factor(label),
         estimate = factor(estimate))

my_metrics <- metric_set(accuracy, kap, precision, recall, f_meas)

my_metrics(imdb_prediction, truth = truth, estimate = estimate) |>  
  gt() |>  
  data_color(
    columns = .estimate,
    fn = scales::col_numeric(
      palette = c("red", "orange", "green"),
      domain = c(0, 1)
    )
  )
.metric .estimator .estimate
accuracy binary 0.8650400
kap binary 0.7300800
precision binary 0.8774194
recall binary 0.8486400
f_meas binary 0.8627898

Working with Python in R

Why combine Python with R?

Why not just switch to Python?

  1. If you’re here, you probably already know R, so why re-learn things from scratch?
  2. R is a programming language specifically for statistics with some great built-in functionality that you would miss in Python.
  3. R has absolutely outstanding packages for data science with no drop-in replacement in Python (e.g., ggplot2, dplyr, tidytext).

Why not just stick with R then?

  1. Newer models and methods in machine learning are often Python only (as advancements are made by big companies who rely on Python)
  2. You might want to collaborate with someone who uses Python and need to run their code
  3. Learning a new (programming) language is always a good way to extend your skills (also in the language(s) you already know)

Getting started

We start by installing the necessary Python packages, for which you should use a virtual environment (so we set that up first).

Create a Virtual Environment

Before you load reticulate for the first time, we need to create a virtual environment. This is a folder in your project directory with a link to Python and the packages you want to use in this project. Why?

  • Packages (or their dependencies) on the Python Package Index can be incompatible with each other – meaning you can break things by updating.

  • Your operating system might keep older versions of some packages around, which means you could break your OS with an accidental update!

  • This also makes projects more reproducible on other systems, as you keep track of the specific version of each package used in your project (you could do the same in R with the renv package).

To find the correct version of Python to link to in the virtual environment:

if (R.Version()$os == "mingw32") {
  system("where python") # for Windows
} else {
  system("whereis python")
}

I choose the main Python installation in “/usr/bin/python” and use it as the base for a virtual environment. If you don’t have any Python version on your system, you can install one with reticulate::install_miniconda().

# I added this if condition to not accidentally overwrite the environment when rerunning the notebook
if (!reticulate::virtualenv_exists(envname = "./python-env/")) {
  reticulate::virtualenv_create("./python-env/", python = "/usr/bin/python")
  # for Windows the path is usually "C:/Users/{user}/AppData/Local/r-miniconda/python.exe"
}
reticulate::virtualenv_exists(envname = "./python-env/")
[1] TRUE
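For reference, the same kind of environment can be created from a plain terminal without reticulate (a sketch assuming python3 is on your PATH; on Windows the binaries live in Scripts/ instead of bin/):

```shell
# create ./python-env with its own interpreter and pip
python3 -m venv python-env
# this interpreter only sees packages installed inside the environment
./python-env/bin/python --version
# packages would then go in via: ./python-env/bin/pip install <package>
```

Whichever route you take, the result is the same: an isolated folder that reticulate can point at.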

reticulate is supposed to pick this up automatically when started, but to make sure, I set the environment variable RETICULATE_PYTHON to the Python binary in the new environment:

if (R.Version()$os == "mingw32") {
  python_path <- file.path(getwd(), "python-env/Scripts/python.exe")
} else {
  python_path <- file.path(getwd(), "python-env/bin/python")
}
file.exists(python_path)
[1] TRUE
Sys.setenv(RETICULATE_PYTHON = python_path)

Optional: make this persist across restarts of RStudio by saving the environment variable into an .Renviron file (otherwise the Sys.setenv() line above needs to be in every script):

# open the .Renviron file
usethis::edit_r_environ(scope = "project")
# or directly append it with the necessary line
readr::write_lines(
  x = paste0("RETICULATE_PYTHON=", python_path),
  file = ".Renviron",
  append = TRUE
)

Now reticulate should pick up the correct binary in the project folder:

library(reticulate)
py_config()
python:         /home/johannes/Documents/Github/aca_vienna/python-env/bin/python
libpython:      /usr/lib/libpython3.11.so
pythonhome:     /home/johannes/Documents/Github/aca_vienna/python-env:/home/johannes/Documents/Github/aca_vienna/python-env
version:        3.11.3 (main, Jun  5 2023, 09:32:32) [GCC 13.1.1 20230429]
numpy:          /home/johannes/Documents/Github/aca_vienna/python-env/lib/python3.11/site-packages/numpy
numpy_version:  1.24.4

NOTE: Python version was forced by RETICULATE_PYTHON

Installing Packages

reticulate::py_install() installs packages, similar to install.packages() in R. Let’s install the packages we need:

reticulate::py_install(c(
  "scikit-learn<1.3.0",
  "bertopic==0.14.1", # this one requires some build tools not usually available on Windows, comment out to install the rest
  "sentence_transformers",
  "simpletransformers"
))

Recreating grafzahl from Python

(If you do not have an NVIDIA graphics card, following these steps locally does not make a lot of sense. You can instead run them in this Google Colab notebook I created.) To demonstrate the workflow for reticulate, we do the same analysis as above, but rely on Python functions:

import pandas as pd
import os
import torch
from simpletransformers.classification import ClassificationModel

# args copied from grafzahl, learn more at https://simpletransformers.ai/docs/usage/
model_args = {
  "num_train_epochs": 1, # increase for multiple runs, which can yield better performance
  "use_multiprocessing": False,
  "use_multiprocessing_for_evaluation": False,
  "overwrite_output_dir": True,
  "reprocess_input_data": True,
  "fp16": True,
  "save_steps": -1,
  "save_eval_checkpoints": False,
  "save_model_every_epoch": False,
  "silent": True,
}

os.environ["TOKENIZERS_PARALLELISM"] = "false"

roberta_model = ClassificationModel(model_type="roberta",
                                    model_name="roberta-base",
                                    # Use GPU if available
                                    use_cuda=torch.cuda.is_available(),
                                    args=model_args)

We have constructed a training and test set from the movie review corpus in R above. Now we can train the model on the coded training set and predict the classes for the test set (if you do not have a GPU, this will take a long time, so maybe do it after the course):

# process data to the form simpletransformers needs
train_df = r.imdb_train
train_df['labels'] = train_df['label'].astype('category').cat.codes
train_df = train_df[['text', 'labels']]

roberta_model.train_model(train_df)

# test data needs to be a list
test_l = r.imdb_test["text"].tolist()
predictions, raw_outputs = roberta_model.predict(test_l)
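As an aside, the `.astype('category').cat.codes` step above just maps the string labels to integer codes, with string categories ordered alphabetically by default. In plain Python terms (a sketch of the mapping, not the pandas implementation):

```python
# what category codes amount to for string labels
labels = ["neg", "pos", "pos", "neg"]
categories = sorted(set(labels))  # pandas sorts string categories alphabetically
codes = [categories.index(lab) for lab in labels]
print(codes)  # [0, 1, 1, 0] -> "neg" becomes 0, "pos" becomes 1
```

Keeping this mapping in mind matters later, when the integer predictions have to be translated back into the original labels.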
imdb_prediction <- imdb_test |>  
  bind_cols(estimate = factor(c("neg", "pos"))[py$predictions + 1]) |> 
  mutate(truth = factor(label))

saveRDS(imdb_prediction, "data/5_imdb_roberta.rds")
# imdb_prediction <- readRDS("data/5_imdb_roberta.rds")

my_metrics <- metric_set(accuracy, kap, precision, recall, f_meas)

my_metrics(imdb_prediction, truth = truth, estimate = estimate) |>  
  gt() |>  
  data_color(
    columns = .estimate,
    fn = scales::col_numeric(
      palette = c("red", "orange", "green"),
      domain = c(0, 1)
    )
  )
.metric .estimator .estimate
accuracy binary 0.8992000
kap binary 0.7984000
precision binary 0.9082134
recall binary 0.8881600
f_meas binary 0.8980747

Running unsupervised learning with BERTopic

I use the data_corpus_guardian from quanteda.corpora to show an example workflow for BERTopic. This dataset contains Guardian newspaper articles from the politics, economy, society and international sections, published between 2012 and 2016.

library(quanteda.corpora)
corp_news <- download("data_corpus_guardian")[["documents"]]
corp_news_texts <- corp_news$texts
from bertopic import BERTopic
/home/johannes/Documents/Github/aca_vienna/python-env/lib/python3.11/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
from sentence_transformers import SentenceTransformer
from umap import UMAP

# To make this example reproducible
umap_model = UMAP(n_neighbors=15, n_components=5, 
                  min_dist=0.0, metric='cosine', random_state=42)

# confusingly, this is the setup part
topic_model = BERTopic(language="english",
                       top_n_words=5,
                       n_gram_range=(1, 2),
                       nr_topics="auto", # change if you want a specific nr of topics
                       calculate_probabilities=True,
                       umap_model=umap_model)

# and only here we actually run something
topics, doc_topic = topic_model.fit_transform(r.corp_news.texts)

# save the model
topic_model.save("data/5._bertopic")
# topic_model=BERTopic.load("data/5._bertopic")

Unlike traditional topic models, BERTopic uses an algorithm that automatically determines a sensible number of topics and also automatically labels topics:

topic_model <- py$topic_model
topic_labels <- tibble(topic = as.integer(names(topic_model$topic_labels_)),
                       label = unlist(topic_model$topic_labels_ ))  |> 
  mutate(label = fct_reorder(label, topic))
topic_labels
# A tibble: 93 × 2
   topic label                           
   <int> <fct>                           
 1    -1 -1_the_to_of_and                
 2     0 0_the_to_of_and                 
 3     1 1_trump_clinton_the_in          
 4     2 2_her_she_was_he                
 5     3 3_bank_the_the bank_to          
 6     4 4_police_officers_the_was       
 7     5 5_housing_property_homes_the    
 8     6 6_nhs_care_the nhs_health       
 9     7 7_climate_climate change_the_and
10     8 8_the_tax_pay_to                
# ℹ 83 more rows

Note that -1 describes an outlier (“trash”) topic that collects words and documents which do not really belong anywhere. BERTopic also supplies the top words, i.e., the ones that most likely belong to each topic. In the code above I requested 5 words per topic:

top_words <- map_df(names(topic_model$topic_representations_), function(t) {
  map_df(topic_model$topic_representations_[[t]], function(y)
    tibble(feature = y[[1]], prob = y[[2]])) |> 
    mutate(topic = as.integer(t), .before = 1L)
})

We can plot them in the same way as in the last session:

library(tidytext)
top_words |> 
  filter(topic %in% c(1, 7, 44, 53, 65, 66)) |>  # select a couple of topics
  left_join(topic_labels, by = "topic") |> 
  mutate(feature = reorder_within(feature, prob, topic)) |> 
  ggplot(aes(x = prob, y = feature, fill = topic, label = label)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(vars(label), ncol = 2, scales = "free_y") +
  scale_y_reordered() +
  labs(x = NULL, y = NULL)

We can use a nice little visualization built into BERTopic to show how topics are linked to one another:

# map intertopic distance
intertopic_distance = topic_model.visualize_topics(width=700, height=700)
# save fig
intertopic_distance.write_html("media/bert_corp_news_intertopic.html")
htmltools::includeHTML("media/bert_corp_news_intertopic.html")

BERTopic also classifies each document into a topic category (again, not something you should really do with LDA topic models) and provides a nice visualisation of trends over time. Unfortunately, R’s date format does not translate automagically to Python, so we need to convert the dates to strings first:

corp_news_t <- corp_news |> 
  mutate(date_chr = as.character(date))
topics_over_time = topic_model.topics_over_time(docs=r.corp_news_t.texts,
                                                timestamps=r.corp_news_t.date_chr,
                                                global_tuning=True,
                                                evolution_tuning=True,
                                                nr_bins=20)
#plot figure
fig_overtime = topic_model.visualize_topics_over_time(topics_over_time,
                                                      topics=[1, 7, 44, 53, 65, 66])
#save figure
fig_overtime.write_html("media/fig_overtime.html")
htmltools::includeHTML("media/fig_overtime.html")

Wrap up

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: EndeavourOS

Matrix products: default
BLAS:   /usr/lib/libblas.so.3.11.0 
LAPACK: /usr/lib/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tidytext_0.4.1         quanteda.corpora_0.9.2 reticulate_1.28       
 [4] gt_0.9.0               yardstick_1.2.0        workflowsets_1.0.1    
 [7] workflows_1.1.3        tune_1.1.1             rsample_1.1.1         
[10] recipes_1.0.6          parsnip_1.1.0          modeldata_1.1.0       
[13] infer_1.0.4            dials_1.2.0            scales_1.2.1          
[16] broom_1.0.4            tidymodels_1.1.0       lubridate_1.9.2       
[19] forcats_1.0.0          stringr_1.5.0          dplyr_1.1.2           
[22] purrr_1.0.1            readr_2.1.4            tidyr_1.3.0           
[25] tibble_3.2.1           ggplot2_3.4.2          tidyverse_2.0.0       

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0    timeDate_4022.108   farver_2.1.1       
 [4] fastmap_1.1.1       janeaustenr_1.0.0   digest_0.6.32      
 [7] rpart_4.1.19        timechange_0.2.0    lifecycle_1.0.3    
[10] tokenizers_0.3.0    survival_3.5-5      magrittr_2.0.3     
[13] compiler_4.3.1      sass_0.4.5          rlang_1.1.1        
[16] tools_4.3.1         utf8_1.2.3          yaml_2.3.7         
[19] data.table_1.14.8   knitr_1.42          labeling_0.4.2     
[22] htmlwidgets_1.6.2   xml2_1.3.4          DiceDesign_1.9     
[25] withr_2.5.0         nnet_7.3-19         grid_4.3.1         
[28] fansi_1.0.4         colorspace_2.1-0    future_1.32.0      
[31] globals_0.16.2      iterators_1.0.14    MASS_7.3-60        
[34] cli_3.6.1           rmarkdown_2.21      generics_0.1.3     
[37] rstudioapi_0.14     future.apply_1.10.0 tzdb_0.4.0         
[40] splines_4.3.1       parallel_4.3.1      vctrs_0.6.3        
[43] hardhat_1.3.0       Matrix_1.5-4.1      jsonlite_1.8.7     
[46] hms_1.1.3           listenv_0.9.0       foreach_1.5.2      
[49] gower_1.0.1         glue_1.6.2          parallelly_1.35.0  
[52] codetools_0.2-19    stringi_1.7.12      gtable_0.3.3       
[55] munsell_0.5.0       GPfit_1.0-8         pillar_1.9.0       
[58] furrr_0.3.1         htmltools_0.5.5     ipred_0.9-14       
[61] lava_1.7.2.1        R6_2.5.1            lhs_1.1.6          
[64] evaluate_0.20       lattice_0.21-8      SnowballC_0.7.1    
[67] png_0.1-8           backports_1.4.1     class_7.3-22       
[70] Rcpp_1.0.10         prodlim_2023.03.31  xfun_0.39          
[73] pkgconfig_2.0.3    
py_list_packages()
                     package     version                        requirement
1                    absl-py       1.4.0                     absl-py==1.4.0
2                    aiohttp       3.8.4                     aiohttp==3.8.4
3                  aiosignal       1.3.1                   aiosignal==1.3.1
4                     altair       5.0.1                      altair==5.0.1
5                    appdirs       1.4.4                     appdirs==1.4.4
6              async-timeout       4.0.2               async-timeout==4.0.2
7                      attrs      23.1.0                      attrs==23.1.0
8                   bertopic      0.14.1                   bertopic==0.14.1
9                    blinker       1.6.2                     blinker==1.6.2
10                cachetools       5.3.1                  cachetools==5.3.1
11                   certifi    2023.5.7                  certifi==2023.5.7
12        charset-normalizer       3.1.0          charset-normalizer==3.1.0
13                     click       8.1.3                       click==8.1.3
14                     cmake      3.26.4                      cmake==3.26.4
15                    Cython     0.29.35                    Cython==0.29.35
16                  datasets      2.13.1                   datasets==2.13.1
17                 decorator       5.1.1                   decorator==5.1.1
18                      dill       0.3.6                        dill==0.3.6
19            docker-pycreds       0.4.0              docker-pycreds==0.4.0
20                  filelock      3.12.2                   filelock==3.12.2
21                frozenlist       1.3.3                  frozenlist==1.3.3
22                    fsspec    2023.6.0                   fsspec==2023.6.0
23                     gitdb      4.0.10                      gitdb==4.0.10
24                 GitPython      3.1.31                  GitPython==3.1.31
25               google-auth      2.21.0                google-auth==2.21.0
26      google-auth-oauthlib       1.0.0        google-auth-oauthlib==1.0.0
27                    grpcio      1.56.0                     grpcio==1.56.0
28                   hdbscan      0.8.29                    hdbscan==0.8.29
29           huggingface-hub      0.15.1            huggingface-hub==0.15.1
30                      idna         3.4                          idna==3.4
31        importlib-metadata       6.7.0          importlib-metadata==6.7.0
32                    Jinja2       3.1.2                      Jinja2==3.1.2
33                    joblib       1.3.1                      joblib==1.3.1
34                jsonschema      4.17.3                 jsonschema==4.17.3
35                       lit      16.0.6                        lit==16.0.6
36                  llvmlite      0.40.1                   llvmlite==0.40.1
37                  Markdown       3.4.3                    Markdown==3.4.3
38            markdown-it-py       3.0.0              markdown-it-py==3.0.0
39                MarkupSafe       2.1.3                  MarkupSafe==2.1.3
40                     mdurl       0.1.2                       mdurl==0.1.2
41                    mpmath       1.3.0                      mpmath==1.3.0
42                 multidict       6.0.4                   multidict==6.0.4
43              multiprocess     0.70.14              multiprocess==0.70.14
44                  networkx         3.1                      networkx==3.1
45                      nltk       3.8.1                        nltk==3.8.1
46                     numba      0.57.1                      numba==0.57.1
47                     numpy      1.24.4                      numpy==1.24.4
48        nvidia-cublas-cu11  11.10.3.66     nvidia-cublas-cu11==11.10.3.66
49    nvidia-cuda-cupti-cu11    11.7.101   nvidia-cuda-cupti-cu11==11.7.101
50    nvidia-cuda-nvrtc-cu11     11.7.99    nvidia-cuda-nvrtc-cu11==11.7.99
51  nvidia-cuda-runtime-cu11     11.7.99  nvidia-cuda-runtime-cu11==11.7.99
52         nvidia-cudnn-cu11    8.5.0.96        nvidia-cudnn-cu11==8.5.0.96
53         nvidia-cufft-cu11   10.9.0.58       nvidia-cufft-cu11==10.9.0.58
54        nvidia-curand-cu11  10.2.10.91     nvidia-curand-cu11==10.2.10.91
55      nvidia-cusolver-cu11    11.4.0.1     nvidia-cusolver-cu11==11.4.0.1
56      nvidia-cusparse-cu11   11.7.4.91    nvidia-cusparse-cu11==11.7.4.91
57          nvidia-nccl-cu11      2.14.3           nvidia-nccl-cu11==2.14.3
58          nvidia-nvtx-cu11     11.7.91          nvidia-nvtx-cu11==11.7.91
59                  oauthlib       3.2.2                    oauthlib==3.2.2
60                 packaging        23.1                    packaging==23.1
61                    pandas       2.0.3                      pandas==2.0.3
62                 pathtools       0.1.2                   pathtools==0.1.2
63                    Pillow       9.5.0                      Pillow==9.5.0
64                    plotly      5.15.0                     plotly==5.15.0
65                  protobuf      4.23.3                   protobuf==4.23.3
66                    psutil       5.9.5                      psutil==5.9.5
67                   pyarrow      12.0.1                    pyarrow==12.0.1
68                    pyasn1       0.5.0                      pyasn1==0.5.0
69            pyasn1-modules       0.3.0              pyasn1-modules==0.3.0
70                    pydeck     0.8.1b0                    pydeck==0.8.1b0
71                  Pygments      2.15.1                   Pygments==2.15.1
72                   Pympler       1.0.1                     Pympler==1.0.1
73               pynndescent      0.5.10                pynndescent==0.5.10
74                pyrsistent      0.19.3                 pyrsistent==0.19.3
75           python-dateutil       2.8.2             python-dateutil==2.8.2
76                      pytz      2023.3                       pytz==2023.3
77     pytz-deprecation-shim 0.1.0.post0 pytz-deprecation-shim==0.1.0.post0
78                    PyYAML         6.0                        PyYAML==6.0
79                     regex    2023.6.3                    regex==2023.6.3
80                  requests      2.31.0                   requests==2.31.0
81         requests-oauthlib       1.3.1           requests-oauthlib==1.3.1
82                      rich      13.4.2                       rich==13.4.2
83                       rsa         4.9                           rsa==4.9
84               safetensors       0.3.1                 safetensors==0.3.1
85              scikit-learn       1.2.2                scikit-learn==1.2.2
86                     scipy      1.11.1                      scipy==1.11.1
87     sentence-transformers       2.2.2       sentence-transformers==2.2.2
88             sentencepiece      0.1.99              sentencepiece==0.1.99
89                sentry-sdk      1.26.0                 sentry-sdk==1.26.0
90                   seqeval       1.2.2                     seqeval==1.2.2
91              setproctitle       1.3.2                setproctitle==1.3.2
92        simpletransformers     0.63.11        simpletransformers==0.63.11
93                       six      1.16.0                        six==1.16.0
94                     smmap       5.0.0                       smmap==5.0.0
95                 streamlit      1.24.0                  streamlit==1.24.0
96                     sympy        1.12                        sympy==1.12
97                  tenacity       8.2.2                    tenacity==8.2.2
98               tensorboard      2.13.0                tensorboard==2.13.0
99   tensorboard-data-server       0.7.1     tensorboard-data-server==0.7.1
100            threadpoolctl       3.1.0               threadpoolctl==3.1.0
101               tokenizers      0.13.3                 tokenizers==0.13.3
102                     toml      0.10.2                       toml==0.10.2
103                    toolz      0.12.0                      toolz==0.12.0
104                    torch       2.0.1                       torch==2.0.1
105              torchvision      0.15.2                torchvision==0.15.2
106                  tornado       6.3.2                     tornado==6.3.2
107                     tqdm      4.65.0                       tqdm==4.65.0
108             transformers      4.30.2               transformers==4.30.2
109                   triton       2.0.0                      triton==2.0.0
110        typing_extensions       4.7.1           typing_extensions==4.7.1
111                   tzdata      2023.3                     tzdata==2023.3
112                  tzlocal       4.3.1                     tzlocal==4.3.1
113               umap-learn       0.5.3                  umap-learn==0.5.3
114                  urllib3     1.26.16                   urllib3==1.26.16
115               validators      0.20.0                 validators==0.20.0
116                    wandb      0.15.4                      wandb==0.15.4
117                 watchdog       3.0.0                    watchdog==3.0.0
118                 Werkzeug       2.3.6                    Werkzeug==2.3.6
119                   xxhash       3.2.0                      xxhash==3.2.0
120                     yarl       1.9.2                        yarl==1.9.2
121                     zipp      3.15.0                       zipp==3.15.0

References

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” arXiv. https://doi.org/10.48550/arXiv.1706.03762.